Head-Driven Statistical Models for Natural Language Parsing

Author

  • Michael Collins
Abstract

Supervisor: Professor Mitch Marcus

Statistical models for parsing natural language have recently shown considerable success in broad-coverage domains. Ambiguity often leads to an input sentence having many possible parse trees; statistical approaches assign a probability to each tree, thereby ranking competing trees in order of plausibility. The probability for each candidate tree is calculated as a product of terms, each term corresponding to some sub-structure within the tree. The choice of parameterization is the choice of how to break down the tree. There are two critical questions regarding the parameterization of the problem. What linguistic objects (e.g., context-free rules, parse moves) should the model's parameters be associated with? I.e., how should trees be decomposed into smaller fragments? How can this choice be instantiated in a sound probabilistic model? This thesis argues that the locality of a lexical head's influence in a tree should motivate modeling choices in the parsing problem. In the final parsing models, a parse tree is represented as the sequence of decisions corresponding to a head-centered, top-down derivation of the tree. Independence assumptions then follow naturally, leading to parameters that encode the X-bar schema, subcategorization, ordering of complements, placement of adjuncts, lexical dependencies, wh-movement, and preferences for close attachment. All of these preferences are expressed by probabilities conditioned on lexical heads.
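As a rough illustration of the idea that a tree's probability is a product of terms, one per sub-structure, the sketch below scores two candidate parses of an ambiguous sentence and ranks them. It is not the thesis's head-driven model: it assumes the simplest possible decomposition (one term per context-free rule), and the rule table `RULE_PROB`, its probabilities, the tuple-based tree encoding, and the function names are all hypothetical choices made for this example. Lexical terms are omitted (treated as probability 1).

```python
import math

# Hypothetical rule probabilities, invented for illustration only.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")): 0.6,
    ("VP", ("V", "NP", "PP")): 0.4,
    ("NP", ("N",)): 0.7,
    ("NP", ("NP", "PP")): 0.3,
    ("PP", ("P", "NP")): 1.0,
}

def tree_log_prob(tree):
    """Sum of log rule probabilities, i.e. the log of the product of terms.

    A tree is a tuple (label, child, ...); a preterminal is (label, word).
    """
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return 0.0  # preterminal over a word: lexical term omitted in this sketch
    rhs = tuple(child[0] for child in children)
    log_p = math.log(RULE_PROB[(label, rhs)])
    return log_p + sum(tree_log_prob(child) for child in children)

def rank(candidates):
    """Return (name, log probability) pairs, most plausible tree first."""
    scored = [(name, tree_log_prob(tree)) for name, tree in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Two parses of "she saw stars with telescopes" (prepositional-phrase attachment).
candidates = {
    "vp_attach": ("S", ("NP", ("N", "she")),
                  ("VP", ("V", "saw"), ("NP", ("N", "stars")),
                   ("PP", ("P", "with"), ("NP", ("N", "telescopes"))))),
    "np_attach": ("S", ("NP", ("N", "she")),
                  ("VP", ("V", "saw"),
                   ("NP", ("NP", ("N", "stars")),
                    ("PP", ("P", "with"), ("NP", ("N", "telescopes")))))),
}

ranked = rank(candidates)
for name, log_p in ranked:
    print(name, math.exp(log_p))
```

With these made-up numbers the verb-phrase attachment ranks first. Changing the decomposition (for example, conditioning each term on a lexical head, as the abstract proposes) changes which terms appear in the product, which is the parameterization question the abstract raises.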

Similar resources

The Effect of Morphology on Dependency Parsing of Persian

Data-driven systems can be adapted easily to different languages and domains. This trend has led to the introduction of data-driven approaches in dependency parsing. The only prerequisite for data-driven approaches is the existence of appropriate corpora that contain sentences and their associated dependency trees. Despite obtaining highly accurate results for the dependency parsing task in English langu...

An Alternative to Head-Driven Approaches for Parsing a (Relatively) Free Word-Order Language

Applying statistical parsers developed for English to languages with freer word order has turned out to be harder than expected. This paper investigates the adequacy of different statistical parsing models for dealing with a (relatively) free word-order language. We show that the recently proposed Relational-Realizational (RR) model consistently outperforms state-of-the-art Head-Driven (HD) model...

An improved joint model: POS tagging and dependency parsing

Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of natural language sentences, producing a dependency graph for each input sentence. Part-of-speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform POS tagging and dependency parsing in a pipeline mode. Unfortunately, in pipel...

Head-Driven Parsing for Word Lattices

We present the first application of the head-driven statistical parsing model of Collins (1999) as a simultaneous language model and parser for large-vocabulary speech recognition. The model is adapted to an online left-to-right chart parser for word lattices, integrating acoustic, n-gram, and parser probabilities. The parser uses structural and lexical dependencies not considered by n-gram model...

Probabilistic Models for Disambiguation of an HPSG-Based Chart Generator

We describe probabilistic models for a chart generator based on HPSG. Within the research field of parsing with lexicalized grammars such as HPSG, recent developments have achieved efficient estimation of probabilistic models and high-speed parsing guided by probabilistic models. The focus of this paper is to show that two essential techniques – model estimation on packed parse forests and beam...

Journal title:
  • Computational Linguistics

Volume 29, Issue 

Pages  -

Publication date: 2003